Last updated: 2021-02-08
Checks: 7 passed, 0 failed
Knit directory: Supplemenetary_data_files_automation_v2/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(20201119) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 555b1a2. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: code/Construction/
Ignored: data/Inputs/Large_tables/
Untracked files:
Untracked: DELETE.jpeg
Untracked: Rplot.jpeg
Untracked: analysis/graphing_heierachies.rmd
Untracked: delete/
Unstaged changes:
Modified: analysis/SUT_presentation_RY2018.Rmd
Modified: analysis/cleaning_SEPH.Rmd
Modified: analysis/supp_files.rmd
Modified: analysis/technical_R_tips.Rmd
Modified: code/Publishing_script.R
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/Provincial_view.rmd) and HTML (docs/Provincial_view.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
| File | Version | Author | Date | Message |
|---|---|---|---|---|
| Rmd | 555b1a2 | arman | 2021-02-08 | wflow_publish(c(“analysis/Provincial_view.rmd”, “analysis/index.Rmd”)) |
| html | acae2de | arman | 2021-02-08 | Build site. |
| Rmd | 5b9cbf6 | arman | 2021-02-08 | fixed the date |
| html | c978ec7 | arman | 2021-02-08 | Build site. |
| html | 210b49e | arman | 2021-02-08 | Build site. |
| Rmd | 39b2d18 | arman | 2021-02-08 | wflow_publish(c(“analysis/Provincial_view.rmd”, “analysis/index.Rmd”)) |
| Rmd | 05f2e30 | arman | 2021-02-08 | Finished LFS |
| Rmd | 03e788f | arman | 2021-02-07 | fixed the employment |
| Rmd | 07cced5 | arman | 2021-02-05 | added employment by industury |
| html | 3077a6a | arman | 2021-02-04 | Build site. |
| Rmd | 3a7a11e | arman | 2021-02-04 | wflow_publish(c(“analysis/Provincial_view.rmd”, “analysis/index.Rmd”)) |
| html | 0276906 | arman | 2021-02-04 | Build site. |
| Rmd | 8fdc2eb | arman | 2021-02-04 | added CPI with goods services and all items |
| Rmd | 41617a0 | arman | 2021-02-04 | ADDED the tables for prices/agriculture |
| html | f449255 | arman | 2021-02-03 | Build site. |
| Rmd | 247d0fd | arman | 2021-02-03 | wflow_publish(c(“analysis/Provincial_view.rmd”, “analysis/index.Rmd”)) |
| Rmd | 3db0059 | arman | 2021-02-03 | added back the provinces to filter construction on; created and used across to add _K to all the columns and also to create summary annual tables easier |
| Rmd | 88a4612 | arman | 2021-02-02 | added a simple version of construction; will need some refinement. construction is rather complicated |
| Rmd | a105189 | arman | 2021-02-01 | addded Immigration and a method to cacluate any tibbles first difference and percent change |
| html | 6a066ff | arman | 2021-01-29 | Build site. |
| html | ad426b3 | arman | 2021-01-29 | Build site. |
| Rmd | aa8d76b | arman | 2021-01-29 | wflow_publish(c(“analysis/Provincial_view.rmd”, “analysis/index.Rmd”)) |
| Rmd | ab73d32 | arman | 2021-01-29 | Completed the master table for Trade population and PGDP |
| Rmd | 777e601 | arman | 2021-01-28 | created an annualized trade table. |
| Rmd | dcf20f4 | arman | 2021-01-27 | finished monthly trade with YOY % change. |
| Rmd | 974fcc7 | arman | 2021-01-26 | now measuring year over year average; |
| Rmd | 3deba0d | arman | 2021-01-26 | NDM has no consistent way of displaying dates; each table needs to be parsed in a custom manner. in this commit I use the parse date function in tidyverse to create a custom parse function for the dates in the trade table. hopefully I can use this to parse more |
| html | ffc28e9 | arman | 2021-01-22 | Build site. |
| Rmd | d13d74f | arman | 2021-01-22 | scraped the params and replaced a piece of r code; got the left join to work and merged the PGDP and population table |
| Rmd | 704ee1d | arman | 2021-01-21 | fuxed the developer_mode |
| Rmd | 76913f8 | arman | 2021-01-21 | testing new creds |
| Rmd | aaaadf9 | arman | 2021-01-21 | set up the first table. it all works |
| Rmd | 2ae6621 | arman | 2021-01-20 | added tables and the developer mode for including stuff in the document |
| Rmd | edeeade | arman | 2021-01-19 | ported the fucntions and set up added code lists and functions to be used for the project. |
| html | ac1cbff | arman | 2021-01-19 | Build site. |
| html | 1771361 | arman | 2021-01-19 | Build site. |
| html | 49fbbde | arman | 2021-01-19 | Build site. |
| Rmd | 8538afc | arman | 2021-01-19 | starting the official provincial view |
In this document I will create a view based on the following tables:
contribution to total economy
population 17-10-0009-01.
immigration 17-10-0008-01.
labor force Statistics Canada, table 14-10-0327-01.
Farms, by operation type 32-10-0403-01.
Aquaculture in Canada 32-10-0107-01.
Manufacturing industries 16-10-0117-01.
International merchandise trade by province, commodity, and Principal Trading Partners 12-10-0119-01
Investment in Building Construction(monthly) 34-10-0175-01
Building permits, by type of structure and type of work 34-10-0066-01
In this section I will set up the foundation for the program. It is broken up into 3 parts: package dependencies and paths, functions, and finally code lists.
In this section I will manage dependencies (the packages used in this project). In R you need to state which packages you will use (more or less), and here we simply state them. They have to be installed for the program to run properly. If you run this in the cloud, the commands below should be sufficient if you uncomment the first 5 lines.
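As a minimal sketch of the dependency step described above, the snippet below checks which packages are installed before anything is loaded. The package list is an assumption inferred from the calls that appear elsewhere on this page; the project's actual setup chunk is not shown here and may differ.

```r
# Packages inferred from the calls that appear in this document
# (an assumption; the project's real list may differ).
required_packages <- c(
  "cansim", "janitor", "lubridate", "tidyverse",
  "here", "summarytools", "DT"
)

# Uncomment to install the packages (only needed once per machine).
# install.packages(required_packages)

# Report which packages are not installed yet, without loading any of them.
is_installed <- required_packages %in% rownames(installed.packages())
missing_packages <- required_packages[!is_installed]

if (length(missing_packages) > 0) {
  message("Missing packages: ", paste(missing_packages, collapse = ", "))
}
```

Checking before loading gives one clear message listing everything that is missing, rather than a failure on the first `library()` call.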
All the files in this program are created relative to where you run this script; it will create all the necessary subfolders and files from the location below.
here()
[1] "C:/Users/Arman/Documents/Statcan/R_projects/Supplemenetary_data_files_automation_v2"
This section states the code lists that the project uses to sort and query data. If you want the tables to include something more, you can simply add it here and it should work. It is designed this way to make the program more extensible and maintainable; the added complexity is worth it.
If you have any tables to add, please add them here; it will make this program easier to maintain.
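To illustrate the idea of a code list, here is a small sketch. The table numbers are the ones that appear in the download logs on this page. The NAICS codes and the toy data are hypothetical placeholders, because the project's real `two_digit_naics_code` list is not shown here.

```r
# Table numbers as they appear in the download logs on this page.
table_numbers <- c(
  pgdp       = "36-10-0402",
  population = "17-10-0005",
  trade      = "12-10-0119"
)

# Hypothetical code list; the real two_digit_naics_code is defined
# in the project and its values may be formatted differently.
example_naics_codes <- c("[11]", "[21]", "[22]")

# Toy data standing in for a cleaned table.
toy <- data.frame(
  classification_code = c("[11]", "[111]", "[21]", "[T001]"),
  value = c(1.2, 0.4, -0.3, 2.0)
)

# Querying against the code list: adding a code to the list is enough
# to include the matching rows, with no change to the query itself.
filtered <- toy[toy$classification_code %in% example_naics_codes, ]
```

Keeping the lists in one place means a new table or industry code is a one-line change rather than an edit to every pipeline that uses it.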
Here I will describe the functions required in the project. The most important one creates interactive tables in the browser that let you choose which columns to show and search through the table.
For the search function and for performance, it is important that the columns are categories (factors) and/or numerics.
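The project's own `interactive_table()` is not shown on this page, so the following is only a sketch of how such a function could look, assuming it is built on the DT package. The names `prepare_for_table` and `interactive_table_sketch` are hypothetical.

```r
# Convert character columns to factors so that column filters can offer
# a list of categories instead of free-text matching.
prepare_for_table <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  for (col in names(df)[is_chr]) {
    df[[col]] <- factor(df[[col]])
  }
  df
}

# Hypothetical stand-in for the project's interactive_table():
# per-column filters on top plus a column-visibility button.
interactive_table_sketch <- function(df) {
  DT::datatable(
    prepare_for_table(df),
    filter = "top",
    extensions = "Buttons",
    options = list(dom = "Bfrtip", buttons = I("colvis"))
  )
}

# Toy data to show the conversion step on its own.
toy <- data.frame(
  geo = c("Ontario", "Quebec"),
  value = c(1.5, 2.5),
  stringsAsFactors = FALSE
)
prepared <- prepare_for_table(toy)
```

Calling `interactive_table_sketch(toy)` in an R Markdown chunk would render the widget, provided DT is installed.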
TODO: find a way to handle significant digits
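One possible way to approach this TODO, offered as a sketch rather than the project's chosen solution, is to round every numeric column with base R's `signif()` before the table is displayed. The helper name `round_numeric_columns` and the toy values are hypothetical.

```r
# Round every numeric column to a fixed number of significant digits,
# leaving non-numeric columns untouched.
round_numeric_columns <- function(df, digits = 3) {
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      df[[col]] <- signif(df[[col]], digits)
    }
  }
  df
}

# Toy data with values of very different magnitudes.
toy <- data.frame(
  geo = c("Ontario", "Quebec"),
  CPC = c(1.23456, 0.000123456)
)
rounded <- round_numeric_columns(toy, digits = 3)
```

This changes the stored values. If only the display should be rounded, DT's `formatSignif()` may be a better fit, since it leaves the underlying data as it is.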
I will attempt to import and clean all the tables.
### PGDP
Here I will be working with Provincial GDP.
pgdp_table_number <- "36-10-0402"
pgdp <- cansim::get_cansim_ndm(pgdp_table_number)
Accessing CANSIM NDM product 36-10-0402 from Statistics Canada
Parsing data
Folding in metadata
pgdp <- janitor::clean_names(pgdp)
pgdp %>%
select(ref_date, geo, north_american_industry_classification_system_naics, value, hierarchy_for_north_american_industry_classification_system_naics, coordinate, classification_code_for_north_american_industry_classification_system_naics, value_2) %>%
mutate(ref_date = lubridate::make_date(ref_date)) %>%
rename(CPC = value_2) %>%
filter(ref_date > from_this_year) %>%
filter(value == "Contributions to percent change") %>%
select(!value) %>%
filter(classification_code_for_north_american_industry_classification_system_naics %in% c(two_digit_naics_code)) %>%
group_by(ref_date, north_american_industry_classification_system_naics, geo) -> clean_PGDP
# maybe wait till the end to convert the tables to save some space
interactive_table(clean_PGDP)
# data.table::fwrite(here("output", str_glue("Table_{tablenumber}_CPI_by_province_product_groups_{params$from_this_year}-{params$to_this_year}")))
pgdp
# A tibble: 302,289 x 21
ref_date geo dguid value north_american_~ uom uom_id scalar_factor
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 1997 Newf~ 2016~ Curr~ All industries ~ Doll~ 81 millions
2 1997 Newf~ 2016~ Curr~ Goods-producing~ Doll~ 81 millions
3 1997 Newf~ 2016~ Curr~ Service-produci~ Doll~ 81 millions
4 1997 Newf~ 2016~ Curr~ Industrial prod~ Doll~ 81 millions
5 1997 Newf~ 2016~ Curr~ Non-durable man~ Doll~ 81 millions
6 1997 Newf~ 2016~ Curr~ Durable manufac~ Doll~ 81 millions
7 1997 Newf~ 2016~ Curr~ Information and~ Doll~ 81 millions
8 1997 Newf~ 2016~ Curr~ Information and~ Doll~ 81 millions
9 1997 Newf~ 2016~ Curr~ Information and~ Doll~ 81 millions
10 1997 Newf~ 2016~ Curr~ Energy sector [~ Doll~ 81 millions
# ... with 302,279 more rows, and 13 more variables: scalar_id <chr>,
# vector <chr>, coordinate <chr>, value_2 <dbl>, status <chr>, symbol <chr>,
# terminated <chr>, decimals <chr>, geo_uid <chr>,
# classification_code_for_value <chr>, hierarchy_for_value <chr>,
# classification_code_for_north_american_industry_classification_system_naics <chr>,
# hierarchy_for_north_american_industry_classification_system_naics <chr>
pgdp %>%
select(ref_date, geo, north_american_industry_classification_system_naics, value, value_2) %>%
mutate(ref_date = lubridate::make_date(ref_date)) %>%
rename(CPC = value_2) %>%
filter(ref_date > from_this_year) %>%
filter(value == "Contributions to percent change") %>%
select(!value) %>%
filter(north_american_industry_classification_system_naics == "All industries [T001]") %>%
pivot_wider(names_from = north_american_industry_classification_system_naics, values_from = CPC) %>%
janitor::clean_names() %>%
rename(CPC_All_industries = all_industries_t001) -> master_pgdp
library(summarytools)
Registered S3 method overwritten by 'pryr':
method from
print.bytes Rcpp
For best results, restart R session and update pander using devtools:: or remotes::install_github('rapporter/pander')
Attaching package: 'summarytools'
The following object is masked from 'package:tibble':
view
population <- cansim::get_cansim_ndm(population_table_number)
Accessing CANSIM NDM product 17-10-0005 from Statistics Canada
Parsing data
Folding in metadata
population %>%
janitor::clean_names() %>%
select(1, 2, 4, 5, 12) %>%
rename(population = value) %>%
mutate(ref_date = lubridate::make_date(ref_date)) %>%
filter(ref_date > from_this_year) %>%
filter(age_group == "All ages") %>%
filter(sex == "Both sexes") %>%
select(1, 2, 5) -> clean_population
# view(dfSummary(clean_population))
interactive_table(clean_population)
This is the year-over-year (YOY) percentage change and first difference for monthly trade by NAPCS. I will create the annual series tomorrow by simply summing over the year. That seems to be what the chief economist did, and it makes sense since the data is already seasonally adjusted.
trade <- cansim::get_cansim_ndm(trade_table_number)
Accessing CANSIM NDM product 12-10-0119 from Statistics Canada
Parsing data
Folding in metadata
trade <- janitor::clean_names(trade)
trade %>%
select(ref_date, geo, trade, north_american_product_classification_system_napcs, principal_trading_partners, value) %>%
mutate(ref_date = parse_date(ref_date, "%Y-%m")) %>%
filter(ref_date > from_this_year) %>%
pivot_wider(names_from = trade, values_from = value) %>%
janitor::clean_names() %>%
arrange(north_american_product_classification_system_napcs, principal_trading_partners, geo, ref_date) %>%
group_by(north_american_product_classification_system_napcs, principal_trading_partners, geo) %>%
mutate(yoy_pct_change_import = ((import / lag(import, n = 12, order_by = ref_date)) * 100) - 100) %>%
mutate(yoy_first_dif_import = (import - lag(import, n = 12, order_by = ref_date))) %>%
mutate(yoy_pct_change_export = ((domestic_export / lag(domestic_export, n = 12, order_by = ref_date)) * 100) - 100) %>%
mutate(yoy_first_dif_export = (domestic_export - lag(domestic_export, n = 12, order_by = ref_date))) %>%
interactive_table()
Warning in instance$preRenderHook(instance): It seems your data is too big
for client-side DataTables. You may consider server-side processing:
https://rstudio.github.io/DT/server.html
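The year-over-year figures computed in the pipeline above, and the annual series planned as the next step, can be sketched in base R on toy data. The numbers below are invented for illustration and are not Statistics Canada data. The sketch uses a single series, whereas the real pipeline applies the same calculation within each product, partner, and province group.

```r
# Toy monthly series: 24 months of imports for a single group.
monthly <- data.frame(
  ref_date = seq(as.Date("2019-01-01"), by = "month", length.out = 24),
  import = c(rep(100, 12), rep(110, 12))
)

# Value from 12 months earlier (NA for the first year).
lag12 <- c(rep(NA_real_, 12), head(monthly$import, -12))

# Same formulas as in the pipeline above.
monthly$yoy_first_dif_import <- monthly$import - lag12
monthly$yoy_pct_change_import <- (monthly$import / lag12) * 100 - 100

# Annual series: sum the monthly values within each calendar year.
monthly$year <- as.integer(format(monthly$ref_date, "%Y"))
annual <- aggregate(import ~ year, data = monthly, FUN = sum)
```

Lagging by 12 rows only equals "the same month last year" when every group has exactly one row per month with no gaps, so it is worth checking for missing months before trusting the result.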